Research question: Does happiness differ between the Swiss regions?
2022/09/29
Variables:
Nuts2: Large regions
H1 (a variable name, not a hypothesis): Q1 "How happy or unhappy" [1 Completely happy - 7 Completely unhappy]
H1 is obviously ordinal: can the mean even be an appropriate summary?
Hypothesis 1: The respondents from the 7 regions reported different mean happiness levels.
Hypothesis 2: Respondents from Espace Mittelland reported higher mean happiness levels than Zentralschweiz.
| Nuts2 | n | mean | trimmed10 | median | sd | var | skew | kurt |
|---|---|---|---|---|---|---|---|---|
| Région lémanique | 570 | 2.91 | 2.91 | 3 | 0.92 | 0.85 | 0.17 | 3.58 |
| Espace Mittelland | 720 | 2.72 | 2.72 | 3 | 0.89 | 0.79 | 0.36 | 3.01 |
| Nordwestschweiz | 427 | 2.70 | 2.70 | 3 | 0.88 | 0.78 | 0.47 | 4.05 |
| Zürich | 513 | 2.76 | 2.76 | 3 | 0.97 | 0.95 | 0.59 | 3.93 |
| Ostschweiz | 433 | 2.73 | 2.73 | 3 | 0.95 | 0.90 | 0.75 | 4.56 |
| Zentralschweiz | 275 | 2.61 | 2.61 | 3 | 0.85 | 0.73 | 0.62 | 4.00 |
| Ticino | 160 | 2.91 | 2.91 | 3 | 0.98 | 0.95 | 1.16 | 6.14 |
Box plots are excellent for displaying distributions.
Why are they not a good choice in this case?
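One possible illustration, using two made-up 7-point samples (the vectors `a` and `b` are assumptions for this sketch, not the survey data): with only a few integer response values, clearly different distributions can share the same five-number summary, so their box plots would look the same.

```r
# Two hypothetical samples of 100 ordinal responses each (not the survey data)
a <- rep(1:7, times = c(5, 30, 45, 15, 3, 1, 1))
b <- rep(1:7, times = c(20, 10, 50, 15, 3, 1, 1))

# The five-number summary is what a box plot draws
fivenum(a)
fivenum(b)

# Yet the category frequencies and the means differ
table(a)
table(b)
c(mean_a = mean(a), mean_b = mean(b))
```

A bar chart of the category frequencies would show the difference that the box plot hides.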
WARNING: depending on the bin size and bin boundaries, histograms can be misleading.
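A minimal sketch of how this can happen with integer-coded responses, using a made-up vector `x` (an assumption, not the survey data): when the breaks sit exactly on the integers, `hist()` merges the two lowest categories into one bin.

```r
# Hypothetical 7-point responses (not the survey data)
x <- rep(1:7, times = c(5, 30, 45, 15, 3, 1, 1))

# Breaks on the integers: hist() uses right-closed bins and includes the
# lowest value, so the first bin [1, 2] merges the categories 1 and 2
h_merged <- hist(x, breaks = 1:7, plot = FALSE)

# Breaks centred between the integers: one bin per response category
h_clean <- hist(x, breaks = seq(0.5, 7.5, by = 1), plot = FALSE)

h_merged$counts
h_clean$counts
```

For a 7-point item, breaks at the half-integers (or a plain bar chart) keep each response category in its own bar.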
Quantile-quantile (Q-Q) plots are a great way to compare the sample distribution to a theoretical distribution. Ideally, the points would fall on the reference line.
Why do we see a stair pattern?
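The stair pattern can be reproduced with simulated data. The sketch below draws hypothetical 7-point responses (the probabilities are assumptions, not estimates from the survey) and inspects the Q-Q coordinates without plotting.

```r
set.seed(1)
# Hypothetical 7-point responses drawn with assumed probabilities (not the survey data)
x <- sample(1:7, size = 500, replace = TRUE,
            prob = c(0.05, 0.35, 0.40, 0.13, 0.04, 0.02, 0.01))

# Q-Q coordinates without plotting: theoretical (x) vs. sample (y) quantiles
qq <- qqnorm(x, plot.it = FALSE)

# Many distinct theoretical quantiles share the same few sample values
length(unique(qq$x))
length(unique(qq$y))
```

Each step of the stair corresponds to one response category: the sample quantile stays constant while the theoretical quantile keeps increasing.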
oneway.test(H1 ~ Nuts2, var.equal = FALSE, data = df_1f)
##
##  One-way analysis of means (not assuming equal variances)
##
## data:  H1 and Nuts2
## F = 5.2507, num df = 6.0, denom df = 1030.9, p-value = 2.434e-05
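For intuition, Welch's F can be approximated from the rounded summary statistics in the descriptive table. This is a sketch of the standard Welch formula, not the internals of `oneway.test()`; because the inputs are rounded, the result only approximates the reported F = 5.2507 and denominator df = 1030.9.

```r
# Summary statistics copied from the descriptive table (rounded values)
n <- c(570, 720, 427, 513, 433, 275, 160)
m <- c(2.91, 2.72, 2.70, 2.76, 2.73, 2.61, 2.91)
v <- c(0.85, 0.79, 0.78, 0.95, 0.90, 0.73, 0.95)

k <- length(n)
w <- n / v                       # precision weights n_i / s_i^2
m_w <- sum(w * m) / sum(w)       # variance-weighted grand mean
lambda <- sum((1 - w / sum(w))^2 / (n - 1))

# Welch's F statistic and its approximate denominator degrees of freedom
f_welch <- (sum(w * (m - m_w)^2) / (k - 1)) /
  (1 + 2 * (k - 2) / (k^2 - 1) * lambda)
df2 <- (k^2 - 1) / (3 * lambda)
p <- pf(f_welch, k - 1, df2, lower.tail = FALSE)

round(c(F = f_welch, df1 = k - 1, df2 = df2, p = p), 5)
```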
Levine and Hullett (2002) recommend ω² or η² as effect size for ANOVAs.
aov(H1 ~ Nuts2, data = df_1f) %>% effectsize::omega_squared(verbose = FALSE) %>% toTable()
| Parameter | Omega2 | CI | CI_low | CI_high |
|---|---|---|---|---|
| Nuts2 | 0.0079776 | 0.95 | 0.002278 | 1 |
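As a rough cross-check, ω² and η² can be approximated from the rounded summary statistics in the descriptive table using the textbook one-way formulas. This is a sketch under that assumption; because of rounding, it will only approximate the value reported by `effectsize::omega_squared()`.

```r
# Summary statistics copied from the descriptive table (rounded values)
n <- c(570, 720, 427, 513, 433, 275, 160)
m <- c(2.91, 2.72, 2.70, 2.76, 2.73, 2.61, 2.91)
v <- c(0.85, 0.79, 0.78, 0.95, 0.90, 0.73, 0.95)

N <- sum(n)
k <- length(n)
grand <- sum(n * m) / N                  # unweighted-by-variance grand mean

ss_between <- sum(n * (m - grand)^2)     # between-groups sum of squares
ss_within  <- sum((n - 1) * v)           # within-groups sum of squares
ms_within  <- ss_within / (N - k)

# eta^2 is the raw share of variance; omega^2 corrects its upward bias
eta2   <- ss_between / (ss_between + ss_within)
omega2 <- (ss_between - (k - 1) * ms_within) /
  (ss_between + ss_within + ms_within)

round(c(eta2 = eta2, omega2 = omega2), 4)
```

Both measures stay below 0.01, so less than 1 % of the variance in H1 is associated with region.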
Hypothesis 1: The respondents from the 7 regions reported different mean happiness levels.
-> The null hypothesis can be rejected (F = 5.25, p < .001), but the effect is negligible (ω² ≈ 0.008).
The binning of effect sizes is just a rule of thumb and somewhat arbitrary.
| | Cohen (1992) | Field (2013) |
|---|---|---|
| very small | < 0.02 | < 0.01 |
| small | < 0.13 | < 0.06 |
| medium | < 0.26 | < 0.14 |
| large | >= 0.26 | >= 0.14 |
When rating the effect size, consider the customs of your (sub-)domain and, more importantly, the size of other known effects on your dependent variable.
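To make the arbitrariness concrete, the cut-offs from the table above can be applied by hand. The helper `interpret_by_table()` is a hypothetical name introduced for this sketch; it is not part of any package.

```r
# Cut-offs transcribed from the table above; the function name is illustrative
interpret_by_table <- function(x, rule = c("cohen1992", "field2013")) {
  rule <- match.arg(rule)
  cutoffs <- switch(rule,
                    cohen1992 = c(0.02, 0.13, 0.26),
                    field2013 = c(0.01, 0.06, 0.14))
  labels <- c("very small", "small", "medium", "large")
  # findInterval() counts how many cut-offs are <= x (intervals closed on the left)
  labels[findInterval(x, cutoffs) + 1]
}

# The observed effect is "very small" under both rule sets ...
interpret_by_table(0.008, "cohen1992")
interpret_by_table(0.008, "field2013")

# ... but the same value can land in different bins depending on the rule
interpret_by_table(0.1, "cohen1992")
interpret_by_table(0.1, "field2013")
```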
The R package effectsize includes various rules to help with the interpretation.
effectsize::interpret_omega_squared(0.008, rules = "cohen1992")
## [1] "very small"
## (Rules: cohen1992)
The typical way to test Hypothesis 2 (Espace Mittelland reporting a higher mean H1 than Zentralschweiz) is a linear contrast, but this is NOT the recommended way.
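# Estimated marginal means per region from a linear model
f1_emm <- lm(H1 ~ Nuts2, data = df_1f) %>%
  emmeans::emmeans('Nuts2', data = df_1f)
# Weights follow the factor level order: +1 Espace Mittelland, -1 Zentralschweiz
emmeans::test(
  emmeans::contrast(f1_emm, list(ac1 = c(0, 1, 0, 0, 0, -1, 0))),
  adjust = 'none')
##  contrast estimate     SE   df t.ratio p.value
##  ac1         0.114 0.0651 3091   1.753  0.0797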
Note 1: This analytic contrast tests a distinct hypothesis; hence no p-adjustment is needed. Comparisons without specific hypotheses (e.g., orthogonal contrasts) would need an adjustment of the significance level (e.g., False Discovery Rate).
Note 2: This analytic contrast is 2-sided, but H2 is 1-sided -> p needs to be halved. Halving is only valid because the estimate (0.114) lies in the hypothesized direction; it gives p ≈ 0.040.
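The halving can be checked directly from the reported t ratio and degrees of freedom. A minimal sketch using only the values printed in the contrast output:

```r
# Values taken from the contrast output (t.ratio and df)
t_ratio <- 1.753
df_res  <- 3091

# Two-sided p-value, as reported by emmeans::test()
p_two <- 2 * pt(abs(t_ratio), df_res, lower.tail = FALSE)

# One-sided p-value for the hypothesized direction (estimate > 0)
p_one <- pt(t_ratio, df_res, lower.tail = FALSE)

round(c(two_sided = p_two, one_sided = p_one), 4)
```

Had the estimate pointed in the opposite direction, the one-sided p-value would be 1 minus this value, not half of the two-sided one.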
BUT: Linear contrasts are very sensitive to variance heterogeneity. Jan & Shieh (2019) recommend Welch’s t-test instead.
Perform Welch’s t-test
df_1f %>%
  filter(Nuts2 %in% c('Espace Mittelland', 'Zentralschweiz')) %>%
  t.test(H1 ~ Nuts2, data = ., alternative = 'greater')
##
##  Welch Two Sample t-test
##
## data:  H1 by Nuts2
## t = 1.866, df = 513.61, p-value = 0.03131
## alternative hypothesis: true difference in means between group Espace Mittelland and group Zentralschweiz is greater than 0
## 95 percent confidence interval:
##  0.01333942        Inf
## sample estimates:
## mean in group Espace Mittelland    mean in group Zentralschweiz
##                        2.725000                        2.610909
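To show what the test computes, the statistic and the Welch-Satterthwaite degrees of freedom can be approximated from the group means in the output and the rounded variances and group sizes in the descriptive table. This is a sketch; the rounded inputs mean the result only approximates t = 1.866 and df = 513.61.

```r
# Means from the t-test output; variances and n from the descriptive table
m1 <- 2.725000; v1 <- 0.79; n1 <- 720   # Espace Mittelland
m2 <- 2.610909; v2 <- 0.73; n2 <- 275   # Zentralschweiz

# Standard error without pooling the variances
se <- sqrt(v1 / n1 + v2 / n2)
t_welch <- (m1 - m2) / se

# Welch-Satterthwaite approximation of the degrees of freedom
df_welch <- se^4 / ((v1 / n1)^2 / (n1 - 1) + (v2 / n2)^2 / (n2 - 1))

# One-sided p-value, matching alternative = 'greater'
p_one <- pt(t_welch, df_welch, lower.tail = FALSE)

round(c(t = t_welch, df = df_welch, p = p_one), 4)
```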
Get effect size
df_1f %>%
  filter(Nuts2 %in% c('Espace Mittelland', 'Zentralschweiz')) %>%
  mutate(Nuts2 = forcats::fct_drop(Nuts2)) %>%
  effsize::cohen.d(H1 ~ Nuts2, data = .)
##
## Cohen's d
##
## d estimate: 0.1299925 (negligible)
## 95 percent confidence interval:
##        lower        upper
## -0.009234415  0.269219496
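Cohen's d is the mean difference divided by the pooled standard deviation. A sketch from the rounded summary statistics (group means from the Welch output, variances and n from the descriptive table), which should land close to the reported 0.13:

```r
# Means from the t-test output; variances and n from the descriptive table
m1 <- 2.725000; v1 <- 0.79; n1 <- 720   # Espace Mittelland
m2 <- 2.610909; v2 <- 0.73; n2 <- 275   # Zentralschweiz

# Pooled standard deviation, weighting each variance by its degrees of freedom
sd_pooled <- sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))

# Standardized mean difference
d <- (m1 - m2) / sd_pooled
round(d, 3)
```

Note that the two-sided confidence interval in the output includes 0, while the one-sided test is significant at the 5 % level.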
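Hypothesis 2: Respondents from Espace Mittelland reported higher mean happiness levels than Zentralschweiz.
-> The null hypothesis can be rejected (one-sided Welch's t-test, p = 0.031), but the effect is negligible (d = 0.13).
Keep the coding in mind when interpreting the direction: H1 runs from 1 (completely happy) to 7 (completely unhappy), so the higher mean in Espace Mittelland corresponds to slightly lower reported happiness.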
Games-Howell Modification of the Tukey Test
Works with unequal sample sizes and heterogeneity of variance.
rstatix::games_howell_test(df_1f, H1 ~ Nuts2, conf.level = 0.95, detailed = FALSE)
## # A tibble: 21 × 8
##    .y.   group1            group2      estimate conf.…¹ conf.h…²   p.adj p.adj…³
##  * <chr> <chr>             <chr>          <dbl>   <dbl>    <dbl>   <dbl> <chr>
##  1 H1    Région lémanique  Espace Mit… -0.189    -0.339 -0.0388  4   e-3 **
##  2 H1    Région lémanique  Nordwestsc… -0.218    -0.389 -0.0482  3   e-3 **
##  3 H1    Région lémanique  Zürich      -0.158    -0.328  0.0130  9.2 e-2 ns
##  4 H1    Région lémanique  Ostschweiz  -0.182    -0.358 -0.00555 3.8 e-2 *
##  5 H1    Région lémanique  Zentralsch… -0.303    -0.494 -0.113   6.31e-5 ****
##  6 H1    Région lémanique  Ticino      -0.00779  -0.264  0.249   1   e+0 ns
##  7 H1    Espace Mittelland Nordwestsc… -0.0294   -0.189  0.130   9.98e-1 ns
##  8 H1    Espace Mittelland Zürich       0.0313   -0.129  0.191   9.97e-1 ns
##  9 H1    Espace Mittelland Ostschweiz   0.00710  -0.159  0.173   1   e+0 ns
## 10 H1    Espace Mittelland Zentralsch… -0.114    -0.295  0.0669  5.04e-1 ns
## # … with 11 more rows, and abbreviated variable names ¹conf.low, ²conf.high,
## #   ³p.adj.signif
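The Games-Howell p-value for a single pair can be sketched in base R: a Welch-type t statistic is referred to the studentized range distribution for all k = 7 groups. The helper `games_howell_p()` is a hypothetical name, and the code follows the usual textbook formula, which is assumed (not verified here) to match what `rstatix` does internally. Inputs are the rounded summary statistics, so the results are approximate.

```r
# Illustrative helper (not part of rstatix): Games-Howell p-value for one pair
games_howell_p <- function(m1, v1, n1, m2, v2, n2, k) {
  se2 <- v1 / n1 + v2 / n2                       # unpooled squared standard error
  t_abs <- abs(m1 - m2) / sqrt(se2)
  df <- se2^2 / ((v1 / n1)^2 / (n1 - 1) + (v2 / n2)^2 / (n2 - 1))
  # Studentized range statistic q = |t| * sqrt(2), compared across k means
  ptukey(t_abs * sqrt(2), nmeans = k, df = df, lower.tail = FALSE)
}

# Espace Mittelland vs. Zentralschweiz (rounded summary statistics)
p_em_zs <- games_howell_p(2.725, 0.79, 720, 2.611, 0.73, 275, k = 7)

# Région lémanique vs. Zentralschweiz
p_rl_zs <- games_howell_p(2.91, 0.85, 570, 2.61, 0.73, 275, k = 7)

round(c(EM_vs_ZS = p_em_zs, RL_vs_ZS = p_rl_zs), 5)
```

The same Espace Mittelland vs. Zentralschweiz difference that was significant as a single planned one-sided test is far from significant once it is treated as one of 21 unplanned pairwise comparisons.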
Pairwise t-tests with alpha adjustment (pooled SD as run here; set pool.sd = FALSE for Welch-type tests)
pairwise.t.test(df_1f$H1, df_1f$Nuts2, pool.sd = TRUE, p.adjust.method = "fdr")
##
##  Pairwise comparisons using t tests with pooled SD
##
## data:  df_1f$H1 and df_1f$Nuts2
##
##                   Région lémanique Espace Mittelland Nordwestschweiz Zürich
## Espace Mittelland 0.00171          -                 -               -
## Nordwestschweiz   0.00171          0.69941           -               -
## Zürich            0.01678          0.69105           0.43712         -
## Ostschweiz        0.00797          0.92449           0.69105         0.75808
## Zentralschweiz    0.00015          0.13945           0.34979         0.07963
## Ticino            0.92449          0.06290           0.04002         0.13636
##                   Ostschweiz Zentralschweiz
## Espace Mittelland -          -
## Nordwestschweiz   -          -
## Zürich            -          -
## Ostschweiz        -          -
## Zentralschweiz    0.14054    -
## Ticino            0.08487    0.00644
##
## P value adjustment method: fdr
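In base R, the `fdr` method is an alias for the Benjamini-Hochberg procedure (`"BH"`). The sketch below applies it to four made-up p-values (the vector `p_raw` is an assumption for illustration) and reproduces the adjustment by hand.

```r
# Hypothetical unadjusted p-values (not from the survey)
p_raw <- c(0.001, 0.02, 0.03, 0.5)

# Built-in adjustment; "fdr" and "BH" are the same method
p_fdr <- p.adjust(p_raw, method = "fdr")

# By hand: scale the i-th smallest p-value by m / i, then enforce
# monotonicity from the largest p-value downwards
m_tests <- length(p_raw)
o <- order(p_raw, decreasing = TRUE)
by_hand <- pmin(1, cummin(m_tests / (m_tests:1) * p_raw[o]))[order(o)]

rbind(raw = p_raw, fdr = p_fdr, by_hand = by_hand)
```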